Abstract

The approach of Kleitman (1970) and Kanter (1976) to multivariate concentration function inequalities is generalized in order to obtain for deviation probabilities of sums of independent symmetric random variables a lower bound depending only on deviation probabilities of the terms of the sum. This bound is optimal up to discretization effects, improves on a result of Nagaev (2001), and complements the comparison theorems of Birnbaum (1948) and Pruss (1997). Birnbaum's theorem for unimodal random variables is extended to the lattice case.
1 Introduction

For deviation probabilities $P(|S| > t)$ of sums $S = \sum_{i=1}^n X_i$ of independent, real-valued, and symmetrically distributed random variables $X_i$, Nagaev (2001, Theorem 1, in different notation) obtained the lower bound

(1)    $P(|S| > t) \ge \sum_{k > t/h} 2^{-k} B_p(\{k\}) \qquad (t \in [0, nh[)$

where $h \in ]0, \infty[$ is a free parameter and

(2)    $B_p := \mathop{*}_{i=1}^{n} B_{p_i}$

is the convolution of the Bernoulli distributions $B_{p_i} = (1 - p_i)\delta_0 + p_i \delta_1$ with success probabilities $p_i := P(|X_i| \ge h)$. Nagaev also provided analytically more tractable lower bounds for the right hand side of (1) and showed that the resulting inequalities for $P(|S| > t)$ effectively complement other bounds depending on second and third absolute moments of the random variables $X_i$.

The main purpose of the present note is to provide as Theorem 2.4 below a generalization of Kanter's (1976) concentration function inequality for sums of independent and symmetric random vectors, which yields as Corollary 2.6 below in particular the following improvement of (1), under the same assumptions as above:

(3)    $P(|S| > t) \ge \sum_{k > t/h} \bigl(1 - 2^{-k} F_k(\lfloor t/h \rfloor + 1)\bigr) B_p(\{k\}) \qquad (t \in [0, nh[)$

Here and below, we use the standard notations $\lfloor x \rfloor := \max\{k \in \mathbb{Z} : k \le x\}$ and $\lceil x \rceil := -\lfloor -x \rfloor$, and write

(4)    $F_n(m) := \max_{r \in \mathbb{Z}} \sum_{i=r}^{r+m-1} \binom{n}{i} \qquad (n, m \in \mathbb{N}_0)$

for the sum of the $m$ largest binomial coefficients of order $n$. For $t \in [(n-1)h, nh[$, the inequalities in (1) and (3) are identical, while for $t \in [0, (n-1)h[$ and $B_p(\{n-1\}) > 0$, inequality (3) is strictly sharper than (1). Moreover, as follows from the proof of Corollary 2.6, inequality (3) is optimal up to discretization effects, in the sense that, subject to the stated assumptions, the right hand side of (3) is the greatest lower bound for $P(|S| > t) + \frac12 P(|S| = t)$ for every $t = mh$ with $m \in \{1, \dots, n\}$.

The rest of this note is structured as follows. Section 2 develops the Kleitman-Kanter approach to multivariate concentration function inequalities. A specialization to the one-dimensional case, namely Corollary 2.5, immediately yields the above-mentioned Corollary 2.6 improving Nagaev's result. Section 3 reformulates Corollary 2.5 as a comparison theorem, stated together with related results of Pruss (1997) and Birnbaum (1948). The latter is generalized to the lattice case. Historical remarks are collected in Section 4.

2 A generalized Kanter inequality 

Let $\|\cdot\|$ be a seminorm on an $\mathbb{R}$-vector space $E$ and let $|\cdot|$ denote the usual absolute value on $\mathbb{R}$. We write $\mathbb{N} := \{1, 2, 3, \dots\}$ and $\mathbb{N}_0 := \{0\} \cup \mathbb{N}$.



2.1 Lemma. Let $a \in E$, $m \in \mathbb{N}$ and $C_1, \dots, C_m \subset E$ with

(5)    $x, y \in C_j \;\Rightarrow\; \|x - y\| < \|a\|$

for each $j \in \{1, \dots, m\}$. Then for some $r \in \{1, \dots, m\}$ the translate $C_r - a$ is disjoint from $\bigcup_{j=1}^m C_j$.

Proof. We may assume that $D := \bigcup_{j=1}^m C_j \ne \emptyset$ and $\|a\| > 0$. In the special case $E = \mathbb{R}$, $\|\cdot\| = |\cdot|$ and $a > 0$, we choose $r$ such that $\min D = \min C_r$ if $\min D$ exists, and $\inf D = \inf C_r$ otherwise. In the general case we apply the Hahn-Banach theorem (compare e.g. Rudin (1991), Theorem 3.3 and its Corollary) to yield a linear functional $\ell$ on $E$ with $\ell(a) = \|a\|$ and $|\ell(x)| \le \|x\|$ for every $x \in E$, so that the special case applied to $\ell(a) > 0$ and $\ell(C_1), \dots, \ell(C_m) \subset \mathbb{R}$ yields the claim. □
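The choice of $r$ in the real-line case can be made concrete for finite sets. The following Python sketch is only an illustration under that finiteness assumption; the function name `disjoint_translate_index` and the sample sets are hypothetical and not taken from the source, and indices are 0-based.

```python
from fractions import Fraction

def disjoint_translate_index(sets, a):
    """Real-line case of Lemma 2.1 for finite nonempty sets and a > 0.

    Assumes every set has diameter < a. Returns a 0-based index r such
    that the translate C_r - a misses the union of all the sets.
    """
    if a <= 0 or any(max(c) - min(c) >= a for c in sets):
        raise ValueError("need a > 0 and all diameters below a")
    # Pick a set that contains the overall minimum of the union.
    return min(range(len(sets)), key=lambda j: min(sets[j]))

# Illustrative data (an assumption): three sets of diameter < 1.
sets = [
    {Fraction(1, 4), Fraction(1)},
    {Fraction(0), Fraction(1, 2)},
    {Fraction(2), Fraction(5, 2)},
]
a = Fraction(1)
r = disjoint_translate_index(sets, a)
union = set().union(*sets)
translate = {x - a for x in sets[r]}
print(r, translate.isdisjoint(union))
```

Every element of the translate lies below the minimum of the union, which is why the chosen index works.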



2.2 Lemma. For $n, m \in \mathbb{N}_0$, we have $F_n(m) = \sum_{i=r}^{s} \binom{n}{i}$ with $r = r_{n,m} := \lfloor (n - m + 1)/2 \rfloor$ and $s = r + m - 1$, and also with $\lceil (n - m + 1)/2 \rceil$ in place of $r_{n,m}$. Further,

(6)    $F_n(m) = F_{n-1}(m-1) + F_{n-1}(m+1) \qquad (n, m \in \mathbb{N})$

and $n \mapsto 2^{-n} F_n(m)$ is for every $m \in \mathbb{N}_0$ a decreasing function.

Proof. The claim up to (6) follows easily from the symmetry, monotonicity and recursion properties of the binomial coefficients. The last claim follows, since the right hand side of (6) is $\le 2F_{n-1}(m)$. □
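The three claims of Lemma 2.2 are easy to check numerically for small arguments. The sketch below is a sanity check, not a proof; the helper names `F` and `F_window` and the range bound `N` are illustrative choices.

```python
from math import comb

def F(n, m):
    """Sum of the m largest binomial coefficients of order n."""
    coeffs = sorted((comb(n, i) for i in range(n + 1)), reverse=True)
    return sum(coeffs[:m])

def F_window(n, m):
    """Window sum starting at r = floor((n - m + 1) / 2), as in Lemma 2.2."""
    r = (n - m + 1) // 2
    # Binomial coefficients with index outside 0..n count as zero.
    return sum(comb(n, i) for i in range(r, r + m) if 0 <= i <= n)

N = 12
window_ok = all(F(n, m) == F_window(n, m) for n in range(N) for m in range(N))
recursion_ok = all(
    F(n, m) == F(n - 1, m - 1) + F(n - 1, m + 1)
    for n in range(1, N) for m in range(1, N)
)
# 2^{-n} F_n(m) decreasing in n is equivalent to F_n(m) <= 2 F_{n-1}(m).
decreasing_ok = all(
    F(n, m) <= 2 * F(n - 1, m) for n in range(1, N) for m in range(N)
)
print(window_ok, recursion_ok, decreasing_ok)
```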



We write $\#A$ for the cardinality of a set $A$.



2.3 Theorem (essentially Kleitman's (1970) Theorem I). Let $n, m \in \mathbb{N}$, $a_1, \dots, a_n \in E$, and $C_1, \dots, C_m \subset E$ with

(7)    $x, y \in C_j \;\Rightarrow\; \|x - y\| < \min_{i=1}^{n} \|a_i\|$

for each $j \in \{1, \dots, m\}$. Then

(8)    $\#\Bigl\{ I \subset \{1, \dots, n\} : \sum_{i \in I} a_i \in \bigcup_{j=1}^{m} C_j \Bigr\} \le F_n(m)$

with equality for $E = \mathbb{R}$, $\|\cdot\| = |\cdot|$, $a_1 = \dots = a_n = 1$, and $C_j = \{ \lfloor (n - m + 1)/2 \rfloor + j - 1 \}$.

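Proof. We consider more generally $n, m \in \mathbb{N}_0$ and let $G_n(m)$ denote the supremum of the left hand side of (8) subject to the stated assumptions on $a_1, \dots, a_n$ and $C_1, \dots, C_m$. Then

(9)    $G_n(0) = 0 = F_n(0) \qquad (n \in \mathbb{N}_0)$

(10)    $G_0(m) = 1 = F_0(m) \qquad (m \in \mathbb{N})$

Let $n, m \in \mathbb{N}$. Given $a_1, \dots, a_n \in E$ and $C_1, \dots, C_m \subset E$ with (7), let $a := a_n$ and choose $r$ according to Lemma 2.1. Then the left hand side of (8) is $\#A$ with

$A := \bigl\{ \epsilon \in \{0,1\}^n : \sum_{i=1}^{n} \epsilon_i a_i \in \bigcup_{j=1}^{m} C_j \bigr\} = A_1 \times \{0\} \,\cup\, A_2 \times \{1\} \,\cup\, A_3 \times \{1\}$

where

$A_1 := \bigl\{ \epsilon \in \{0,1\}^{n-1} : \sum_{i=1}^{n-1} \epsilon_i a_i \in \bigcup_{j=1}^{m} C_j \bigr\}$

$A_2 := \bigl\{ \epsilon \in \{0,1\}^{n-1} : \sum_{i=1}^{n-1} \epsilon_i a_i \in C_r - a_n \bigr\}$

$A_3 := \bigl\{ \epsilon \in \{0,1\}^{n-1} : \sum_{i=1}^{n-1} \epsilon_i a_i \in \bigcup_{j \ne r} (C_j - a_n) \bigr\}$

with $A_1 \cap A_2 = \emptyset$ and thus

$\#A \le \#A_1 + \#A_2 + \#A_3 = \#(A_1 \cup A_2) + \#A_3 \le G_{n-1}(m+1) + G_{n-1}(m-1)$

Hence we have

(11)    $G_n(m) \le G_{n-1}(m-1) + G_{n-1}(m+1) \qquad (n, m \in \mathbb{N})$

Now (6), (9), (10) and (11) together imply $G_n(m) \le F_n(m)$ for all $n, m \in \mathbb{N}_0$, as was to be shown. The claim about equality is obvious. □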
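For $E = \mathbb{R}$ the bound (8) can be checked by brute force. The sketch below makes several illustrative assumptions that are not part of the source: the sets $C_j$ are taken to be half-open intervals $[c, c+1[$, the $a_i$ are random rationals with $|a_i| \ge 1$, and the names `count_hits` and `bound_ok` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import comb
import random

def F(n, m):
    """Sum of the m largest binomial coefficients of order n."""
    coeffs = sorted((comb(n, i) for i in range(n + 1)), reverse=True)
    return sum(coeffs[:m])

def count_hits(a, intervals):
    """Count 0/1 vectors e with sum(e_i * a_i) in a union of intervals [lo, hi)."""
    hits = 0
    for e in product((0, 1), repeat=len(a)):
        s = sum(ei * ai for ei, ai in zip(e, a))
        if any(lo <= s < hi for lo, hi in intervals):
            hits += 1
    return hits

# Equality case of Theorem 2.3: a_i = 1 and C_j = {floor((n-m+1)/2) + j - 1}.
n, m = 6, 3
r = (n - m + 1) // 2
extremal = count_hits([1] * n, [(r + j, r + j + 1) for j in range(m)])

# Random instances: |a_i| >= 1 and every interval has length 1, so that
# any two of its points differ by less than min |a_i|.
rng = random.Random(0)
bound_ok = True
for _ in range(200):
    n_t, m_t = rng.randint(1, 8), rng.randint(1, 4)
    a = [rng.choice((-1, 1)) * Fraction(rng.randint(4, 12), 4) for _ in range(n_t)]
    starts = [Fraction(rng.randint(-40, 40), 4) for _ in range(m_t)]
    intervals = [(lo, lo + 1) for lo in starts]
    bound_ok = bound_ok and count_hits(a, intervals) <= F(n_t, m_t)

print(extremal, F(n, m), bound_ok)
```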

We call a random vector $X$ symmetric if it has the same law as $-X$. We recall the definitions (2) and (4) and put

$Q_p := \mathop{*}_{i=1}^{n} \bigl( (1 - p_i)\delta_0 + \tfrac{p_i}{2}(\delta_{-1} + \delta_1) \bigr) \qquad (p \in [0,1]^n)$

2.4 Theorem (Kanter's (1976) Lemma 4.2 generalized). Let $h \in ]0, \infty[$, $n, m \in \mathbb{N}$, and $p \in [0,1]^n$. Then the supremum of

$P\Bigl( \sum_{i=1}^{n} X_i \in \bigcup_{j=1}^{m} C_j \Bigr)$

taken over all measurable $\mathbb{R}$-vector spaces $E$, measurable seminorms $\|\cdot\|$ on $E$, measurable sets $C_1, \dots, C_m \subset E$ with

$x, y \in C_j \;\Rightarrow\; \|x - y\| < 2h$

for every $j \in \{1, \dots, m\}$, and all independent and symmetric $E$-valued random vectors $X_i$ with

$P(\|X_i\| < h) \le 1 - p_i \qquad (i = 1, \dots, n)$

is attained for $E = \mathbb{R}$, $\|\cdot\| = |\cdot|$, $C_j = \{0, h\} + (2j - m - 1)h$, and the $X_i$ symmetric $\mathbb{R}$-valued with $P(X_i = 0) = 1 - p_i = 1 - P(|X_i| = h)$. The value of the supremum is

(12)    $Q_p([-m+1, m]) = \sum_{k=0}^{n} 2^{-k} F_k(m) B_p(\{k\})$



Remark. Analytically convenient and sharp upper bounds for the quantity in (12) in the special case $m = 1$ are provided by Kanter (1976, Lemma 4.3) and by Mattner & Roos (2006). It is an open problem to prove analogous bounds for $m \ge 2$.

Proof. We may assume $h = 1$. Let $n$ etc. up to the $X_i$ be as stated and let us put $\pi_i := 1 - P(\|X_i\| < 1)$. We may assume that $X_i = (1 - B_i)Y_i + B_i R_i Z_i$ with $B_1, \dots, B_n, Y_1, \dots, Y_n, Z_1, \dots, Z_n, R_1, \dots, R_n$ independent, $B_i \sim B_{\pi_i}$, $Y_i \sim P(X_i \in \cdot \mid \|X_i\| < 1) :=$ the conditional distribution of $X_i$ given $\|X_i\| < 1$, $Z_i \sim P(X_i \in \cdot \mid \|X_i\| \ge 1)$, and $P(R_i = -1) = P(R_i = 1) = 1/2$. Then, with $B := (B_1, \dots, B_n)$, with $Q$ denoting the law of $(Y, Z) := (Y_1, \dots, Y_n, Z_1, \dots, Z_n)$, and with $|b| := \sum_{i=1}^{n} b_i$, we have

$P\Bigl( \sum_{i=1}^{n} X_i \in \bigcup_{j=1}^{m} C_j \Bigr)$

(13)    $= \sum_{b \in \{0,1\}^n} P(B = b) \int \underbrace{P\Bigl( \sum_{i=1}^{n} b_i \tfrac{R_i + 1}{2} z_i \in \bigcup_{j=1}^{m} \tfrac12 \bigl( C_j + \sum_{i=1}^{n} (b_i z_i - (1 - b_i) y_i) \bigr) \Bigr)}_{\le\, 2^{-|b|} F_{|b|}(m)} \, dQ(y, z)$

(14)    $\le$ R.H.S.(12) with $\pi$ instead of $p$

(15)    $\le$ R.H.S.(12)

Here the inequality in (13), and hence (14), follows from Theorem 2.3 with those $z_i$ with $b_i = 1$ playing the role of the $a_i$, and with $\tfrac12 \bigl( C_j + \sum_{i=1}^{n} (b_i z_i - (1 - b_i) y_i) \bigr)$ in place of $C_j$. Inequality (15) is true since $\mathbb{N}_0 \ni k \mapsto 2^{-k} F_k(m)$ is decreasing by Lemma 2.2 and $[0,1]^n \ni p \mapsto B_p$ is increasing with respect to the coordinatewise order on $[0,1]^n$ and the usual stochastic order. In the special case $E = \mathbb{R}$ etc. as stated, we have $Y_i \sim \delta_0$ and may replace the distribution of $Z_i$ by $\delta_1$ in deriving (13), and hence get equality everywhere. □
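The identity between the two sides of (12) can be verified in exact arithmetic for a concrete $p$. The sketch below is an illustration only; the helper names and the sample vector `p` are assumptions, with $Q_p$ taken as the convolution of the laws $(1 - p_i)\delta_0 + \tfrac{p_i}{2}(\delta_{-1} + \delta_1)$ and $B_p$ as the convolution of Bernoulli laws.

```python
from fractions import Fraction
from math import comb

def F(n, m):
    """Sum of the m largest binomial coefficients of order n."""
    coeffs = sorted((comb(n, i) for i in range(n + 1)), reverse=True)
    return sum(coeffs[:m])

def bernoulli_convolution(p):
    """List b with b[k] = B_p({k})."""
    pmf = [Fraction(1)]
    for pi in p:
        new = [Fraction(0)] * (len(pmf) + 1)
        for k, w in enumerate(pmf):
            new[k] += w * (1 - pi)
            new[k + 1] += w * pi
        pmf = new
    return pmf

def Q(p):
    """Dictionary {integer: mass} describing Q_p."""
    pmf = {0: Fraction(1)}
    for pi in p:
        new = {}
        for s, w in pmf.items():
            for step, q in ((0, 1 - pi), (-1, pi / 2), (1, pi / 2)):
                new[s + step] = new.get(s + step, 0) + w * q
        pmf = new
    return pmf

def lhs(p, m):
    """Q_p([-m+1, m])."""
    return sum(w for s, w in Q(p).items() if -m + 1 <= s <= m)

def rhs(p, m):
    """sum_k 2^{-k} F_k(m) B_p({k})."""
    return sum(Fraction(F(k, m), 2 ** k) * bk
               for k, bk in enumerate(bernoulli_convolution(p)))

p = [Fraction(1, 2), Fraction(1, 3), Fraction(3, 4), Fraction(1)]
identity_ok = all(lhs(p, m) == rhs(p, m) for m in range(7))
print(identity_ok)
```

Exact fractions are used so that equality can be tested without rounding tolerances.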

2.5 Corollary. Let $0 < h \le H < \infty$ with $m := \lceil H/h \rceil < H/h + 1/2$, $n \in \mathbb{N}$, and $p \in [0,1]^n$. Then the supremum of

(16)    $P\Bigl( \sum_{i=1}^{n} X_i \in \, ]-H, H] + a \Bigr)$

taken over all independent and symmetric $\mathbb{R}$-valued random variables $X_i$ with

(17)    $P(|X_i| < h) \le 1 - p_i \qquad (i = 1, \dots, n)$

and all $a \in \mathbb{R}$, is attained for $P(X_i = 0) = 1 - p_i = 1 - P(|X_i| = h)$ and $a = mh - H$. The value of the supremum is given in (12).

Proof. Given $h, H, m, n, p, X_i$ and $a$ as above, we have

(16)    $\le P\Bigl( \sum_{i=1}^{n} X_i \in \, ]-mh, mh] + b \Bigr)$

$= P\Bigl( \sum_{i=1}^{n} X_i \in \bigcup_{j=1}^{m} C_j \Bigr)$

$\le$ R.H.S.(12)

with $b := a$ and $C_j := \, ]-h, h] + (2j - m - 1)h + b$, using Theorem 2.4 with $E = \mathbb{R}$ and $\|\cdot\| = |\cdot|$. On the other hand, if $P(X_i = 0) = 1 - p_i = 1 - P(|X_i| = h)$ and $a = mh - H$, and if we let $b := 0$ instead of $b := a$, then we can replace the two inequalities in the above calculation by equalities, as the assumption $m < H/h + 1/2$ yields $-mh \le -H + a < -(m-1)h$. □

2.6 Corollary. Let $S = \sum_{i=1}^{n} X_i$ with independent and symmetric $\mathbb{R}$-valued random variables $X_i$ and let $h \in ]0, \infty[$. Then (3) holds with $p_i := P(|X_i| \ge h)$ for $i = 1, \dots, n$.

Proof. For $t \ge 0$, we apply Corollary 2.5 with $a = 0$ and $H = mh$ with $m := \lfloor t/h \rfloor + 1$ to get

$P(|S| \le t) \le P(S \in \, ]-mh, mh]) \le$ R.H.S.(12)

Inequality (3) follows by taking complements, since $F_k(m) = 2^k$ for $k \le m - 1$. □
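To see how the two lower bounds compare numerically, the following sketch evaluates $\sum_{k > t/h} 2^{-k} B_p(\{k\})$ (the form of Nagaev's bound (1) assumed here) and $\sum_{k > t/h} (1 - 2^{-k} F_k(\lfloor t/h \rfloor + 1)) B_p(\{k\})$ (the form of (3) assumed here), together with the exact tail of the three-point terms with values in $\{-h, 0, h\}$. Function names and the sample data are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def F(n, m):
    """Sum of the m largest binomial coefficients of order n."""
    coeffs = sorted((comb(n, i) for i in range(n + 1)), reverse=True)
    return sum(coeffs[:m])

def bernoulli_convolution(p):
    """List b with b[k] = B_p({k})."""
    pmf = [Fraction(1)]
    for pi in p:
        new = [Fraction(0)] * (len(pmf) + 1)
        for k, w in enumerate(pmf):
            new[k] += w * (1 - pi)
            new[k + 1] += w * pi
        pmf = new
    return pmf

def lower_bounds(p, t, h):
    """Return the assumed right hand sides of (1) and (3) for 0 <= t < n*h."""
    b = bernoulli_convolution(p)
    m = t // h + 1                      # floor(t/h) + 1
    ks = range(m, len(b))               # k > t/h  <=>  k >= m
    bound_1 = sum(Fraction(1, 2 ** k) * b[k] for k in ks)
    bound_3 = sum((1 - Fraction(F(k, m), 2 ** k)) * b[k] for k in ks)
    return bound_1, bound_3

def three_point_tail(p, t, h):
    """P(|T| > t) for independent terms on {-h, 0, h} with P(|T_i| = h) = p_i."""
    pmf = {0: Fraction(1)}
    for pi in p:
        new = {}
        for s, w in pmf.items():
            for step, q in ((0, 1 - pi), (-1, pi / 2), (1, pi / 2)):
                new[s + step] = new.get(s + step, 0) + w * q
        pmf = new
    return sum(w for s, w in pmf.items() if abs(s) * h > t)

p = [Fraction(1, 2), Fraction(1, 3), Fraction(3, 4), Fraction(1)]
h = Fraction(1)
rows = []
for t in (Fraction(0), Fraction(1, 2), Fraction(1), Fraction(2), Fraction(3)):
    b1, b3 = lower_bounds(p, t, h)
    rows.append((t, b1, b3, three_point_tail(p, t, h)))
for row in rows:
    print(*row)
```

In each row the second bound is at least the first and at most the exact tail, and the two bounds coincide for $t \in [(n-1)h, nh[$.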



3 Comparison theorems 

For $\mathbb{R}$-valued random variables $U$ and $V$, we write $U \ge_{st} V$ if $U$ is stochastically larger than $V$, that is, if $P(U > t) \ge P(V > t)$ for every $t \in \mathbb{R}$. A specialization of Corollary 2.5 can be viewed as one of three results yielding at least almost a stochastic ordering $|S| \ge_{st} |T|$ for sums $S$, $T$ of independent symmetric random variables assuming a corresponding ordering of their terms, the other two results being theorems of Pruss (1997) and Birnbaum (1948). It therefore appears natural to summarize these results here, and to use this opportunity to extend Birnbaum's theorem to the lattice case.

Let us agree on the following unimodality definitions for laws $P$ on $\mathbb{R}$. We call $P$ unimodal on $\mathbb{R}$, if $P$ is unimodal in the usual sense that, for some $x_0 \in \mathbb{R}$, the distribution function of $P$ is convex on $]-\infty, x_0[$ and concave on $]x_0, \infty[$. For $a \in \mathbb{R}$ and $h \in ]0, \infty[$, we call $P$ unimodal on $h\mathbb{Z} + a$, if $P(h\mathbb{Z} + a) = 1$ and if there is a $k_0 \in \mathbb{Z}$ such that $k \mapsto P(\{hk + a\})$ is increasing on $\{k \in \mathbb{Z} : k \le k_0\}$ and decreasing on $\{k \in \mathbb{Z} : k \ge k_0\}$. For $h \in [0, \infty[$, we call $P$ unimodal with span $h$, if either $h = 0$ and $P$ is unimodal on $\mathbb{R}$, or $h > 0$ and $P$ is unimodal on $h\mathbb{Z} + a$ for some $a \in \mathbb{R}$. As usual, we attribute any property just defined to a random variable $X$ if its distribution enjoys it.

3.1 Theorem. Let $n \in \mathbb{N}$ and let $X_1, \dots, X_n$ as well as $Y_1, \dots, Y_n$ be independent and symmetrically distributed $\mathbb{R}$-valued random variables with sums $S = \sum_{i=1}^{n} X_i$ and $T = \sum_{i=1}^{n} Y_i$ and with

(18)    $|X_i| \ge_{st} |Y_i| \qquad (i = 1, \dots, n)$

(a) (Pruss (1997)) Then

$P(|S| \ge t) \ge \tfrac12 P(|T| \ge t) \qquad (t > 0)$

(b) If $h \in ]0, \infty[$ and $P(Y_i \in \{-h, 0, h\}) = 1$ for $i = 1, \dots, n$, then

(19)    $P(|S| > mh) + \tfrac12 P(|S| = mh) \ge P(|T| > mh) + \tfrac12 P(|T| = mh) \qquad (m \in \mathbb{N})$

(c) (Birnbaum (1948) generalized) Let $h \in [0, \infty[$ and $X_1, \dots, X_n, Y_1, \dots, Y_n$ be unimodal with span $h$. In case of $h > 0$ assume further for each $i \in \{1, \dots, n\}$ that $X_i, Y_i$ are both $h\mathbb{Z}$-valued or both $h(\mathbb{Z} + \tfrac12)$-valued. Then $|S| \ge_{st} |T|$.

See Berger (1997, Theorem 1.1) for a further related comparison theorem.

Example. Let $n = 2$, $X_1, X_2, Y_1 \sim \tfrac12(\delta_{-1} + \delta_1)$, and $Y_2 = 0$. Then $|X_i| \ge_{st} |Y_i|$ for $i = 1, 2$. Since $P(|S| \ge 1) = \tfrac12$ and $P(|T| \ge 1) = 1$, it follows that the constant $\tfrac12$ in Pruss' theorem is best possible. As each of the four random variables is unimodal with span 2, it also follows that the second sentence in part (c) can not be omitted. Further, in this example, $P(Y_i \in \{-1, 0, 1\}) = 1$ for $i = 1, 2$ but $P(|S| > 0) + \tfrac12 P(|S| = 0) = \tfrac34 \not\ge 1 = P(|T| > 0) + \tfrac12 P(|T| = 0)$, showing that in (19) we may not replace $\mathbb{N}$ by $\mathbb{N}_0$.
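The numbers in this example can be reproduced by enumerating the four laws exactly. The sketch below is an illustration; the helper names `law_of_sum`, `tail_ge` and `mid_tail` are hypothetical.

```python
from fractions import Fraction
from itertools import product

def law_of_sum(*laws):
    """Law of a sum of independent variables, each given as {value: mass}."""
    out = {}
    for combo in product(*(law.items() for law in laws)):
        s = sum(x for x, _ in combo)
        w = Fraction(1)
        for _, q in combo:
            w *= q
        out[s] = out.get(s, 0) + w
    return out

def tail_ge(law, t):
    """P(|Z| >= t)."""
    return sum(w for x, w in law.items() if abs(x) >= t)

def mid_tail(law, t):
    """P(|Z| > t) + P(|Z| = t) / 2."""
    strict = sum(w for x, w in law.items() if abs(x) > t)
    atom = sum(w for x, w in law.items() if abs(x) == t)
    return strict + atom / 2

coin = {-1: Fraction(1, 2), 1: Fraction(1, 2)}   # law of X_1, X_2, Y_1
zero = {0: Fraction(1)}                          # law of Y_2

S = law_of_sum(coin, coin)
T = law_of_sum(coin, zero)

print(tail_ge(S, 1), tail_ge(T, 1))      # the two sides in Pruss' inequality at t = 1
print(mid_tail(S, 0), mid_tail(T, 0))    # the two sides of (19) with m = 0
```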



Proof. (a) See Pruss (1997).

(b) Here (18) is equivalent to (17) with $p_i = P(|Y_i| = h)$, so that Corollary 2.5 with $H = mh$ and $a = 0$ yields (19).

(c) Induction based on Lemmas 3.2 and 3.3 given below. In the step from $n - 1$ to $n$, we may assume $X_1, \dots, X_n, Y_1, \dots, Y_n$ to be independent, and conclude that

$\Bigl| \sum_{i=1}^{n} X_i \Bigr| \;\ge_{st}\; \Bigl| \sum_{i=1}^{n-1} X_i + Y_n \Bigr| \;\ge_{st}\; \Bigl| \sum_{i=1}^{n} Y_i \Bigr|$

by applying Lemma 3.2 first to $U_1 := \sum_{i=1}^{n-1} X_i$, $V_1 := X_n$, $W_1 := Y_n$ and then to $U_2 := Y_n$, $V_2 := \sum_{i=1}^{n-1} X_i$, $W_2 := \sum_{i=1}^{n-1} Y_i$, observing that by Lemma 3.3 the sum $U_1$ is symmetric and unimodal with span $h$, and that in case of $h > 0$ the sums $V_2, W_2$ are both $h\mathbb{Z}$-valued or both $h(\mathbb{Z} + \tfrac12)$-valued. □

3.2 Lemma. Let $U, V, W$ be symmetrically distributed $\mathbb{R}$-valued random variables with $U, V$ independent, $U, W$ independent, and $|V| \ge_{st} |W|$. Let $h \in [0, \infty[$ with $U$ unimodal with span $h$. In case of $h > 0$ let further $V, W$ be both $h\mathbb{Z}$-valued or both $h(\mathbb{Z} + \tfrac12)$-valued. Then

$|U + V| \ge_{st} |U + W|$.



Proof. We may assume that $h \in \{0, 1\}$. In case of $h = 0$ we put $A := B := [0, \infty[$, while for $h = 1$ we let $A, B \in \{\mathbb{N}_0, \mathbb{N}_0 + \tfrac12\}$ with $P(|U| \in A) = P(|V| \in B) = 1$. Then for $t \in A + B := \{a + b : a \in A, b \in B\}$ and denoting by $P_U$ etc. the laws of the random variables occurring as subscripts, we have

$P(|U + V| \le t) = \int P_U([v - t, v + t]) \, dP_{|V|}(v)$

$\le \int P_U([v - t, v + t]) \, dP_{|W|}(v)$

$= P(|U + W| \le t)$

since in each case the function $B \ni v \mapsto P_U([v - t, v + t])$ is decreasing. As $P(|U + V| \in A + B) = 1$, this proves $|U + V| \ge_{st} |U + W|$. □



3.3 Lemma (Wintner). Let $X$ and $Y$ be independent $\mathbb{R}$-valued random variables and let $h \in [0, \infty[$. If $X$ and $Y$ are symmetric and unimodal with span $h$, then so is $X + Y$.

Proof. Obvious by writing the laws of $X$ and $Y$ as mixtures of uniform distributions on symmetric intervals in $\mathbb{R}$ or $h\mathbb{Z}$ or $h(\mathbb{Z} + \tfrac12)$. See Dharmadhikari & Joag-Dev (1988, pp. 13 and 109) for the cases where $h = 0$ or $X$ and $Y$ are both symmetric unimodal on $h\mathbb{Z}$. The remaining three cases are analogous. □
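For laws with finite support on $\mathbb{Z}$ (the case $h = 1$ with both variables $\mathbb{Z}$-valued), the conclusion of Lemma 3.3 can be checked directly on an example. The sketch below is only such a spot check; the two sample laws and the helper names are illustrative assumptions.

```python
from fractions import Fraction

def convolve(p, q):
    """Law of X + Y for independent X ~ p and Y ~ q, given as {value: mass}."""
    out = {}
    for x, a in p.items():
        for y, b in q.items():
            out[x + y] = out.get(x + y, 0) + a * b
    return out

def is_symmetric_unimodal_on_Z(law):
    """Masses symmetric about 0 and nonincreasing in k = 0, 1, 2, ..."""
    top = max(abs(k) for k in law)
    symmetric = all(law.get(k, 0) == law.get(-k, 0) for k in range(top + 1))
    masses = [law.get(k, 0) for k in range(top + 2)]
    nonincreasing = all(masses[k] >= masses[k + 1] for k in range(top + 1))
    return symmetric and nonincreasing

# Two symmetric unimodal laws on Z (sample data, not from the source).
X = {-1: Fraction(1, 4), 0: Fraction(1, 2), 1: Fraction(1, 4)}
Y = {k: Fraction(1, 5) for k in range(-2, 3)}

Z = convolve(X, Y)
print(is_symmetric_unimodal_on_Z(X), is_symmetric_unimodal_on_Z(Y),
      is_symmetric_unimodal_on_Z(Z))
```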
4 Historical notes

Theorem 2.3 in the Hilbert space case, and assuming the sets $C_j$ to be slightly smaller than necessary, was proved by Kleitman (1970), generalizing several earlier results and in particular the one-dimensional case due to Erdős (1945, Theorems 1 and 3). Jones (1978, page 4, footnote 7) observed that Kleitman's result and proof extend to general (semi-)normed spaces. Meanwhile, Kanter (1976, Lemma 4.1) proved a weaker result, assuming in particular symmetry of the sets $C_j$. The present proof of Theorem 2.3 is just a slightly refined rewrite of Kleitman's proof and Jones' footnote.

Kanter (1976) essentially stated and proved Theorem 2.4 for $m = 1$ and $C_1$ symmetric. Le Cam (1986, pp. 408-409) adopted Kanter's approach.

Theorem 3.1(c) in the case of $h = 0$ and without atoms at zero is due to Birnbaum (1948). Bickel & Lehmann (1976) and Shaked & Shantikumar (1994, page 78) allowed atoms at zero in their statements, but apparently not in their proofs. Sherman (1955) extended Birnbaum's result to the absolutely continuous multivariate case. Dharmadhikari & Joag-Dev (1988, p. 164) gave an elegant development of Sherman's theorem, dispensing with unnecessary continuity assumptions. They also essentially stated without proof Theorem 3.1(c) for $h > 0$ in the case where all random variables are $h\mathbb{Z}$-valued.

References 

Berger, E. (1997). Comparing sums of independent bounded random variables and sums of Bernoulli random 
variables. Statistics & Probab. Letters 34, 251-258. 

Bickel, P.J. & Lehmann, E.L. (1976). Descriptive statistics for nonparametric models. III. Dispersion. Ann. 
Statist. 4, 1139-1158. 

Birnbaum, Z.W. (1948). On random variables with comparable peakedness. Ann. Math. Statist. 19, 76-81. 

Dharmadhikari, S. & Joag-Dev, K. (1988). Unimodality, Convexity, and Applications. Academic Press, San 
Diego. 

Erdős, P. (1945). On a lemma of Littlewood and Offord. Bull. Amer. Math. Soc. 51, 898-902.

Jones, L. (1978). On the distribution of sums of vectors. SIAM J. Appl. Math. 34, 1-6. 

Le Cam, L. (1986). Asymptotic Methods in Statistical Decision Theory. Springer-Verlag, New York.

Kanter, M. (1976). Probability inequalities for convex sets and multidimensional concentration functions. J. 
Multivariate Anal. 6, 222-236. 

Kleitman, D. (1970). On a lemma of Littlewood and Offord on the distributions of linear combinations of 
vectors. Advances in Math. 5, 155-157. 

Mattner, L. & Roos, B. (2006). A shorter proof of Kanter's Bessel function concentration bound. Preprint. 
Available at arXiv math.PR/0603522

Nagaev, S.V. (2001). Lower bounds for probabilities of large deviations of sums of independent random variables. Theory Probab. Appl. 46, 728-735.

Pruss, A.R. (1997). Comparisons between tail probabilities of sums of independent symmetric random variables. Ann. Inst. Henri Poincaré 33, 651-671.

Rudin, W. (1991). Functional Analysis. 2nd ed. McGraw-Hill, N.Y. 

Shaked, M. & Shantikumar, J.G. (1994). Stochastic Orders and their Applications. Academic Press, San 
Diego. 

Sherman, S. (1955). A theorem on convex sets with applications. Ann. Math. Statist. 26, 763-767. 

Universität zu Lübeck 
Institut für Mathematik 
Wallstr. 40 
D-23560 Lübeck 
Germany 

Email: mattner@math.uni-luebeck.de



